Breaking out of the Black-Box: Research Challenges in Data Mining

Author

  • Padhraic Smyth
Abstract

Database researchers, statisticians, and "data owners" often have quite different views of data. In a database context the traditional goal has been to provide a general and flexible data management framework, with less concern about the content of the data. Statisticians, on the other hand, have traditionally focused primarily on issues of data modeling and inference, with relatively little concern for where the data physically reside or how the data will be accessed. Data owners, in turn, tend to be more focused on using the data as a means to an end: business data owners want to increase revenue by developing better predictive models, and scientific data owners typically want to develop insight into the phenomena generating the data. In this brief position paper we take a broad-scale view of "what people do with their data" (the data owner's perspective) and use this viewpoint to identify current opportunities and challenges for data mining research.

Similar resources

Opening the Black Box: How Data

Opening the Black Box: How Data Mining Works with examples for Social Scientists in Higher Education Research Terrence Willett © 200

A Personalized Recommender System Based on Explanation Facilities Using Collaborative Filtering

Collaborative filtering (CF) is the most successful recommendation method, but its widespread use has exposed some limitations, such as sparsity, scalability, and the black-box problem. Many researchers have focused on the sparsity and scalability problems, but few have tried to solve the black-box problem. Most CF recommender systems are black boxes, providing no transparency into the working of the recomme...

Towards Continuous Behavior Mining

With new advances in Cyber-Physical Systems (CPS) and Internet of Things (IoT), more and more discrete software controllers interact with continuous physical systems. Workflow models are a classical approach to define controllers. However, the effect of the associated actions that are activated by executing the workflow may not spontaneously be realized but have to be realized over time. Genera...

Maturating, Enlarging and Breaking Dormancy of In Vitro Lilium Bulblets

One of the most effective ways to propagate Lilium is by using tissue culture techniques, preferably through bulblet production. In addition to the surviving percentage of Lilium bulbs after transplantation, the small size of in vitro bulblets compared to commercial bulbs and bulblets' dormancy are the most important constraints to commercializing Lilium micropropagation. Different concentratio...

Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report

The Mining Software Repositories (MSR) field analyzes software repository data to uncover knowledge and assist development of ever growing, complex systems. However, existing approaches and platforms for MSR analysis face many challenges when performing large-scale MSR studies. Such approaches and platforms rarely scale easily out of the box. Instead, they often require custom scaling tricks an...

Publication date: 2001